Conversation

Contributor

@tmleman tmleman commented Jan 16, 2026

Add an Xtensa memory barrier (memw) instruction after memset() in the memset_s() implementation to prevent the compiler's dead store elimination (DSE) optimization from removing the memory-clearing operation.

When optimization flags like -O2 are enabled, compilers may perform dead store elimination and incorrectly remove memset() calls used for security purposes to scrub sensitive data from memory. This is critical for confidential data handling where memory must be reliably cleared after use.

The memory barrier ensures the memset operation completes and cannot be optimized away, satisfying secure memory scrubbing requirements for cryptographic operations and sensitive data processing.

Additionally, the patch removes the check for the return value of memset. The standard C library memset always returns the pointer passed as its first argument and does not indicate errors through its return value. Error handling for a NULL destination is already performed earlier in the function, so the return value check is unnecessary and can be safely omitted.
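For illustration, the change described above can be sketched roughly as follows. The argument checks, the -EINVAL return convention, and the non-Xtensa fallback branch are assumptions for the sketch, not the actual SOF implementation:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the patched memset_s(): clear the buffer, then emit a
 * barrier so the compiler cannot treat the memset() as a dead store.
 * On Xtensa the barrier is the hardware "memw"; elsewhere this sketch
 * falls back to a compiler-only barrier so it stays compilable.
 */
static int memset_s(void *dest, size_t destsz, int data, size_t count)
{
	if (!dest || count > destsz)
		return -EINVAL;

	memset(dest, data, count);
	/*
	 * Prevent compiler from optimizing away the memset.
	 * Memory barrier prevents dead store elimination.
	 */
#if defined(CONFIG_XTENSA)
	__asm__ __volatile__("memw" ::: "memory");
#else
	__asm__ __volatile__("" ::: "memory");
#endif
	return 0;
}
```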

Signed-off-by: Tomasz Leman <[email protected]>
Copilot AI review requested due to automatic review settings January 16, 2026 14:28

Copilot AI left a comment


Pull request overview

This PR enhances the security of the memset_s() function by preventing compiler optimizations from eliminating critical memory clearing operations. The change adds a memory barrier instruction for Xtensa architectures and removes unnecessary error checking.

Changes:

  • Added Xtensa-specific memory barrier (memw) after memset() to prevent dead store elimination
  • Removed unnecessary return value check from memset() call since it always returns the destination pointer


Comment on lines +72 to 80
memset(dest, data, count);
/*
* Prevent compiler from optimizing away the memset.
* Memory barrier prevents dead store elimination.
*/
#if defined(CONFIG_XTENSA)
__asm__ __volatile__("memw" ::: "memory");
#endif


Copilot AI Jan 16, 2026


The memory barrier is only applied for Xtensa architecture, leaving other architectures vulnerable to dead store elimination. For security-critical memset_s() function, consider using a portable solution like a volatile function pointer or compiler builtins (e.g., memset_explicit() in C23, or explicit_bzero()) that work across all architectures to prevent compiler optimizations from removing the memory clearing operation.

Suggested change
memset(dest, data, count);
/*
* Prevent compiler from optimizing away the memset.
* Memory barrier prevents dead store elimination.
*/
#if defined(CONFIG_XTENSA)
__asm__ __volatile__("memw" ::: "memory");
#endif
/*
* Prevent compiler from optimizing away the memset.
* Use a volatile function pointer so the call has
* observable side effects on all architectures.
*/
void *(*volatile memset_ptr)(void *, int, size_t) = memset;
memset_ptr(dest, data, count);

Comment on lines +77 to +79
#if defined(CONFIG_XTENSA)
__asm__ __volatile__("memw" ::: "memory");
#endif

Copilot AI Jan 16, 2026


The memory barrier implementation creates architecture-specific code paths in a security function. Consider extracting this into a macro (e.g., SECURE_MEMORY_BARRIER()) defined in an architecture header file. This would improve maintainability and make it easier to add barriers for other architectures in the future.
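A hypothetical sketch of this suggestion: the architecture-specific barrier is hidden behind a macro that each arch header would define, keeping memset_s() itself architecture-neutral. The SECURE_MEMORY_BARRIER() name comes from the comment above; the fallback branch and the helper are assumptions for illustration, not existing SOF code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Per-architecture barrier macro. An arch header would provide the
 * right definition; CONFIG_XTENSA selects the hardware "memw" here,
 * and the fallback is a compiler-only barrier (blocks reordering and
 * dead store elimination, but enforces no hardware ordering).
 */
#if defined(CONFIG_XTENSA)
#define SECURE_MEMORY_BARRIER() __asm__ __volatile__("memw" ::: "memory")
#else
#define SECURE_MEMORY_BARRIER() __asm__ __volatile__("" ::: "memory")
#endif

/* The call site then stays free of architecture-specific #ifdefs. */
static void scrub_example(void *buf, size_t len)
{
	memset(buf, 0, len);
	SECURE_MEMORY_BARRIER();
}
```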


lyakh previously requested changes Jan 19, 2026
Collaborator

@lyakh lyakh left a comment


I understand the motivation, but I don't think this is the right place to fix this. I'd say that memset_s() should have the same semantics as memset() plus a couple of argument checks. I don't think adding a memory barrier here is a good idea: it needlessly punishes most callers, while it's only in those "special" callers that such an optimisation can take place and be harmful. Also, I'm not sure it's the memw that prevents the compiler from optimising the code; I'd rather suspect that it's the __volatile__ attribute. Maybe just __asm__ __volatile__("" ::: "memory") would be enough? But again: not here, but in actual sensitive code sections, and maybe as a meaningful macro.

@tmleman
Contributor Author

tmleman commented Jan 19, 2026

@lyakh Thank you for your feedback. I appreciate your point about the potential unnecessary overhead for most use cases. My initial thought was that adding the barrier directly to memset_s would be the safest approach, ensuring that memory scrubbing is always protected against compiler optimizations.

Alternatively, introducing a dedicated function or macro (e.g., secure_memzero()) for sensitive memory scrubbing is a good idea. My concern was that, during development/code reviews, developers might still use memset_s for scrubbing sensitive data and overlook the secure variant. However, if the team agrees, I’m open to moving the barrier to a specialized function and keeping memset_s lightweight.

Regarding the use of memw, it’s based on the Xtensa ISA documentation, which specifies that it enforces memory ordering at the hardware level.

Let me know your thoughts, and I’m happy to adjust the implementation as needed.

@tmleman tmleman requested a review from lyakh January 19, 2026 11:09
Copy link
Member

@lgirdwood lgirdwood left a comment


In the short term this fixes CC code elimination, but I think the volatile solution is a lot more portable. However, a volatile ptr may cause the memw to occur on each store (in the memset() loop) rather than once at the end of the memset(). This could vary by CC and build flags.
This PR is more direct and deterministic and forces the compiler to always do the right thing. Yes, there will be a perf hit, but I don't think we are calling memset_s() from time-critical code.

@lyakh
Collaborator

lyakh commented Jan 20, 2026

Regarding the use of memw, it’s based on the Xtensa ISA documentation, which specifies that it enforces memory ordering at the hardware level.

@tmleman yes, it's for memory ordering, but is it what we need here? I thought we were trying to avoid erroneous compiler optimisation?

libc defines explicit_bzero() https://sourceware.org/glibc/manual/latest/html_node/Erasing-Sensitive-Data.html - looks like Zephyr libc implementations don't define it. Maybe sof_explicit_bzero()?
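A minimal sketch of the sof_explicit_bzero() floated above (the name is only a suggestion in this thread, not part of the PR). Writing through a volatile-qualified pointer makes each store an observable side effect, which is the classic portable way to keep the compiler from eliding a clearing loop; glibc's explicit_bzero() provides the same guarantee by other means:

```c
#include <stddef.h>

/*
 * Hypothetical explicit-erase helper: the volatile qualifier on the
 * access path means each byte store counts as observable behavior,
 * so dead store elimination may not remove the loop.
 */
static void sof_explicit_bzero(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len--)
		*p++ = 0;
}
```

The trade-off versus the barrier approach: this is portable across architectures and toolchains, but byte-at-a-time volatile stores are typically slower than a library memset() followed by one barrier.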

Collaborator

@kv2019i kv2019i left a comment


I think this is good. OpenBSD and glibc have explicit_bzero() and C23 introduces memset_explicit().

I think our memset_s() is supposed to implement C11 Annex K style semantics (optional, never caught on widely), and indeed that annex does state memset_s() should behave like memset_explicit() and guarantee the writes are not optimized out. So given we use the "memset_s()" naming, it would be highly confusing if we did not follow those semantics. So +1 for this approach. I don't think the barrier is a 100% reliable solution, but for Xtensa and the toolchains currently used in SOF, this will work as expected.

Comment on lines +77 to +79
#if defined(CONFIG_XTENSA)
__asm__ __volatile__("memw" ::: "memory");
#endif
Collaborator

@tmleman I think you can leave out CONFIG_XTENSA. I think this asm notation is supported by all current SOF toolchains. Better for the build to fail if someone tries to build SOF on a more exotic toolchain.

@lyakh lyakh dismissed their stale review January 20, 2026 11:07

standard compliance verified

@tmleman
Contributor Author

tmleman commented Jan 20, 2026

@kv2019i I was thinking more about extending this:

#if defined(CONFIG_XTENSA)
    __asm__ __volatile__("memw" ::: "memory");
#elif defined(__GNUC__) || defined(__clang__)
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
#endif
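Wrapped in a helper for illustration (the helper name is an assumption), the extended proposal looks like this. The "r"(ptr) input operand keeps the buffer's address live across the barrier, so GCC/Clang must treat the preceding stores to that buffer as observable and cannot drop them:

```c
#include <stddef.h>
#include <string.h>

/*
 * Clear a buffer, then emit a barrier: Xtensa gets the hardware
 * "memw"; other GCC/Clang targets get an empty asm with a "memory"
 * clobber plus the buffer address as an input, which is enough to
 * defeat dead store elimination without any hardware ordering.
 */
static void clear_with_barrier(void *ptr, size_t len)
{
	memset(ptr, 0, len);
#if defined(CONFIG_XTENSA)
	__asm__ __volatile__("memw" ::: "memory");
#elif defined(__GNUC__) || defined(__clang__)
	__asm__ __volatile__("" : : "r"(ptr) : "memory");
#endif
}
```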
